Dimensions of Grammatical Coreference

Author

  • Peter C. Gordon
Abstract

The correlational structure of judgments of grammatical coreference is examined using factor analysis, and the results are used to identify the dimensions of grammatical variation in competent speakers of English. The dimensions that are discovered do not correspond to those typically discussed in generative linguistics, but they can be explained very naturally by a model in which coreference is achieved by mapping linguistic expressions onto a model of discourse.

Intuitions of grammaticality constitute the most basic data to be explained by theory in generative linguistics. This has been so since Chomsky (1965) argued that a competent speaker-hearer's implicit knowledge of a language provides the best path to characterizing the essential, generative capacity of a grammar. While generative linguists have accepted intuitions of grammaticality as their basic data, they have by and large eschewed the development of formal methods for assessing those intuitions. There have been occasional, interesting attempts to apply more formal methodology to the study of grammatically significant intuitions, but these have not had much impact on linguistic theory (Schütze, 1996). Psychologists studying intuitions of grammatical (and other types of) well-formedness have come to characterize such studies as "offline" and to regard them primarily as ways of validating materials used in online studies designed to reveal moment-to-moment processing of language.

We believe that more systematic use of formal methods for studying intuitions of grammaticality can be of real value both to formal theories of grammar and to models of language processing. The present paper supports this belief by showing how applying scaling techniques to judgments of grammaticality can reveal how different types of linguistic forms give rise to dimensions of grammaticality in the domain of coreference.

In previous work (Gordon & Hendrick, 1997; Gordon & Hendrick, in press b), we have applied elementary techniques of experimental psychology to the study of judgments of when coreference between two noun phrases (NPs) is grammatically acceptable. The results of these studies were analyzed by calculating the mean acceptability of coreference for different types of NPs in different syntactic relations. The pattern of acceptable coreference in some cases provided support for basic claims presented in the Binding Theory (Chomsky, 1981; 1986) but in other cases did not. In the present work we use factor analysis as a scaling tool for revealing the dimensions underlying grammatical coreference in a community of competent speakers of English. The dimensions that emerge provide information about which forms of referring expressions vary together in their ease of coreferential interpretation. The resulting classification of types of referring expressions is not consistent with central theoretical principles in generative linguistics (Chomsky, 1981; 1986; Evans, 1980; May, 1985), but is consistent with a model that treats the acceptability of coreferential interpretation as emerging from the ease with which a discourse model can be dynamically constructed from linguistic input containing different types of referring expressions (Gordon & Hendrick, in press a).

Coreference With Names and Pronouns

In Gordon and Hendrick (1997) we report a series of surveys of intuitive judgments of coreference designed to test the adequacy of Principle C of the Binding Theory. Those surveys systematically investigated coreference possibilities in sentences with coreference between name-pronoun, name-name and pronoun-name sequences; the structures were systematically varied as to whether the coreferential elements were in a c-command relation or not. (A constituent α is said to c-command another constituent β if the first branching node that dominates α dominates β as well. Principle C states that a name or definite description cannot have a c-commanding antecedent.) Table 1 shows the types of coreferential configurations examined and sample stimuli from the fourth experiment in that work. Subjects were asked to use a six-point scale to rate the grammatical acceptability of a coreferential interpretation of the two boldfaced words. The experiment manipulated the linear order of names and pronouns, and whether a c-command relation existed between them. The implications of the pattern of means are discussed in Gordon and Hendrick (1997). Here we examine how individual variation in grammaticality judgments can reveal underlying dimensions of grammaticality.

Factor analysis is a statistical tool for capturing the correlational structure in a set of data by determining how linear combinations of observed variables can account for the pattern of observed correlations between these variables. It can be used in either an exploratory or a confirmatory manner. For current purposes, we have performed an exploratory factor analysis of the data from this experiment.

Table 2 shows the correlations in subjects’ ratings between sentences in the six basic coreference conditions. A very substantial positive correlation of .76 was observed between ratings for the Name-Name sentences in the c-command and no c-command conditions. Another very substantial positive correlation of .68 was observed between ratings for the Pronoun-Name sentences in the c-command and no c-command conditions. Two other correlations were smaller but still significant: Name-Pronoun sentences in the two c-command conditions (.32) and c-commanded Name-Pronoun sentences with c-commanded Pronoun-Name sentences (-.30). No other correlations approached significance. The strong correlations suggest that our subjects showed reliable individual differences along clear syntactic dimensions.
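
As a rough illustration of how a correlation matrix of this kind leads to a small number of factors, the sketch below places the four correlations reported above into a 6 x 6 matrix and applies the common eigenvalue-greater-than-one retention rule. Several things here are assumptions rather than the authors' procedure: every correlation the text does not report is set to zero as a placeholder, extraction is by unrotated principal components (the text does not state the extraction or rotation method), and the names `labels`, `set_corr` and `loadings` are purely illustrative.

```python
import numpy as np

# Six coreference conditions: sequence type x c-command (Yes/No).
labels = ["N-P: No", "N-P: Yes", "N-N: No", "N-N: Yes", "P-N: No", "P-N: Yes"]

# Start from an identity matrix. ASSUMPTION: correlations that the text
# does not report are set to 0.0 as placeholders, not as real data.
R = np.eye(len(labels))


def set_corr(a: str, b: str, r: float) -> None:
    """Fill one correlation symmetrically."""
    i, j = labels.index(a), labels.index(b)
    R[i, j] = R[j, i] = r


# The four correlations reported in the text.
set_corr("N-N: Yes", "N-N: No", 0.76)
set_corr("P-N: Yes", "P-N: No", 0.68)
set_corr("N-P: Yes", "N-P: No", 0.32)
set_corr("N-P: Yes", "P-N: Yes", -0.30)

# Eigendecomposition of the symmetric matrix, sorted largest first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Retain factors whose eigenvalue exceeds one.
n_factors = int(np.sum(eigenvalues > 1.0))

# Unrotated principal-component loadings for the retained factors.
loadings = eigenvectors[:, :n_factors] * np.sqrt(eigenvalues[:n_factors])

# Share of total variance carried by the retained factors.
explained = eigenvalues[:n_factors].sum() / eigenvalues.sum()

print("eigenvalues:", np.round(eigenvalues, 3))
print("factors retained:", n_factors)
print("variance share:", round(float(explained), 3))
for name, row in zip(labels, loadings):
    print(f"{name:9s}", np.round(row, 2))
```

With these placeholder values three eigenvalues exceed one. The loadings and the variance share should not be expected to reproduce the published factor matrix, since the unreported correlations were presumably not exactly zero and the original analysis may have used a different extraction or rotation.
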

A factor analysis simplifies the pattern in this correlation matrix by determining how linear combinations of the observed variables can account for the pattern of observed correlations between those variables. Factor analysis of these data revealed three factors with eigenvalues greater than one, which together accounted for 82.3 percent of the predictable variance in the matrix. Table 3 shows the resulting factor matrix. The absolute value of the numerical entries for the factors indicates, on a scale of zero to one, the extent to which individual coreference conditions contribute to the factors. Interpretation of the pattern of these weights provides the basis for labeling the factors. Accordingly, Factor 1 can be called the “Name-Pronoun” factor because it depends on Name-Pronoun sequences of NPs. Factor 2 can be called the “Pronoun-Name” factor because it very clearly depends on the Pronoun-Name sequences of NPs, with a smaller contribution of the c-commanded Name-Pronoun sequences. Factor 3 can be called the “Name-Name” factor because it depends on the Name-Name sequences of NPs. The clear-cut pattern of these factors suggests that different principles govern the grammaticality of these different sequences of NPs.

The correlational results (Table 2) and the subsequent factor analysis (Table 3) provide a direct window on the systematic variation of acceptable coreference in a community of competent users of English. They show that there are three independent factors along which individuals reliably vary in their willingness to accept coreference between two NPs. These factors are related to the three sequences of types of NPs that we explored: Name-Pronoun, Pronoun-Name and Name-Name. On each factor a subject’s criterion for accepting coreference with or without c-command ...

Table 1. Sample stimuli and summary results (grammatical acceptability of a coreferential interpretation of the boldfaced words on a 1 to 6 scale) from Experiment 4 of Gordon and Hendrick (1997).

Type of Sequence   Example Stimuli                            C-Command   Average
Name-Pronoun       Lisa’s brother visited her at college.     No          4.47
Name-Pronoun       Lisa visited her brother at college.       Yes         5.32
Name-Name          Lisa’s brother visited Lisa at college.    No          4.12
Name-Name          Lisa visited Lisa’s brother at college.    Yes         3.50
Pronoun-Name       Her brother visited Lisa at college.       No          2.70
Pronoun-Name       She visited Lisa’s brother at college.     Yes         2.44

Table 2. Correlations in subjects’ mean grammaticality ratings for the different configurations of referring expressions for the data in Table 1. N = Name, P = Pronoun, Yes = C-Command, No = no C-Command.

N-P: Yes   N-N: No   N-N: Yes   P-N: No   P-N: Yes
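
The C-Command column of Table 1 can be checked mechanically against the definition given in the text (the first branching node dominating one constituent must also dominate the other). The sketch below does this for the two Name-Pronoun stimuli. The bracketings are simplified, assumed parses rather than the authors' analyses, the `Node` class and function names are hypothetical, and the extra requirement that neither constituent dominates the other is a common refinement that the text does not state.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A constituent in a simplified phrase-structure tree."""
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        # Link each child back to this node so we can walk upward.
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a is a proper ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """True if the first branching node dominating a also dominates b."""
    # ASSUMPTION (common refinement): neither constituent dominates the other.
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent  # skip non-branching nodes
    return node is not None and dominates(node, b)


def pp() -> Node:
    """The adjunct 'at college', shared by both sample sentences."""
    return Node("PP", [Node("at"), Node("college")])


# "Lisa's brother visited her at college." (assumed bracketing)
lisa_poss, her_obj = Node("Lisa's"), Node("her")
s1 = Node("S", [
    Node("NP", [lisa_poss, Node("brother")]),
    Node("VP", [Node("visited"), her_obj, pp()]),
])

# "Lisa visited her brother at college." (assumed bracketing)
lisa_subj, her_poss = Node("Lisa"), Node("her")
s2 = Node("S", [
    lisa_subj,
    Node("VP", [Node("visited"), Node("NP", [her_poss, Node("brother")]), pp()]),
])

print(c_commands(lisa_poss, her_obj))   # possessor inside the subject NP
print(c_commands(lisa_subj, her_poss))  # subject of the clause
```

Under these assumed parses the possessor "Lisa's" does not c-command "her", while the subject "Lisa" does, matching the No and Yes entries for the Name-Pronoun rows of Table 1.
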

Similar resources

Phrase Structures and Dependencies for End-to-End Coreference Resolution

We present experiments in data-driven coreference resolution comparing the effect of different syntactic representations provided as features in the coreference classification step: no syntax, phrase structure representations, dependency representations, and combinations of the representation types. We compare the end-to-end performance of a parametrized state-of-the-art coreference resolution ...

Investigating the relationship among complexity, range, and strength of grammatical knowledge of EFL students

Assessment of grammatical knowledge is a rather neglected area of research in the field with many open questions (Purpura, 2004). The present research incorporates recent proposals about the nature of grammatical development to create a framework consisting of dimensions of complexity, range and strength, and studies which dimension(s) can best predict the stat...

Effects of contrastive intonation and grammatical aspect on processing coreference in Mainstream American English

Coreference choices are influenced by multiple factors, including information structural categories such as topic and focus. These information structural categories can be indicated by intonation, yet few studies have investigated how intonation affects subsequent choices for coreference. Using a story continuation experiment with aurally presented stimuli, we show that the location of contrast...

Corpus based coreference resolution for Farsi text

"Coreference resolution" or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreference when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...

Publication date: 2012